Local polynomial regression 
based on functional data 

K. Benhenni (a), D. Degras* (b)

(a) Laboratoire LJK UMR CNRS 5224, Université de Grenoble, UFR SHS, BP. 47,
F38040 Grenoble Cedex 09, France

(b) Statistical and Applied Mathematical Sciences Institute,
19 T.W. Alexander Drive, P.O. Box 14006, Research Triangle Park, NC 27709, USA

Abstract

Suppose that $n$ statistical units are observed, each following the model
$Y(x_j) = m(x_j) + \varepsilon(x_j)$, $j = 1, \ldots, N$, where $m$ is a regression
function, $0 \le x_1 < \cdots < x_N \le 1$ are observation times spaced according
to a sampling density $f$, and $\varepsilon$ is a continuous-time error process having
mean zero and regular covariance function. Considering the local polynomial
estimation of $m$ and its derivatives, we derive asymptotic expressions for the
bias and variance as $n, N \to \infty$. Such results are particularly relevant in
the context of functional data where essential information is contained in the
derivatives. Based on these results, we deduce optimal sampling densities,
optimal bandwidths and asymptotic normality of the estimator. Simulations are
conducted in order to compare the performances of local polynomial estimators
based on exact optimal bandwidths, asymptotic optimal bandwidths, and
cross-validated bandwidths.

1. Introduction

Local polynomial smoothing is a popular method for estimating the re- 
gression function and its derivatives. Besides its ease of implementation, this 
nonparametric method enjoys several desirable statistical properties such as 
design adaptation, good boundary behavior, and minimax efficiency. See for 
instance the monograph [8] for a thorough introduction to local polynomial
methods. In particular, classical kernel methods like the Nadaraya-Watson
estimator or the Gasser-Müller estimator are closely connected to local
polynomials, as they correspond to local polynomial fitting of order zero.
However, kernel estimators do not share the nice properties of higher order
local polynomials listed above.

There is a vast literature on the asymptotics of local polynomial regression 
estimators under independent measurement errors. Asymptotic bias and 
variance expressions can be found in [SI I2S1 ES], among others (see also [13] 
for kernel methods). Such expressions give important qualitative insights on 
the large sample properties of estimators. They also make it possible to find
optimal theoretical bandwidths and devise data-driven methods for the key
problem of selecting the bandwidth. See for instance [H |23l |2l].

In the case of correlated errors, Opsomer et al. [IH] give an excellent
review of the available asymptotic theory and smoothing parameter selection
methods in nonparametric regression. Local polynomial estimators are studied
for instance under mixing conditions in [18], under association in [l^, and
more recently under (stationary) short-range dependent errors in
Francisco-Fernandez and Vilar-Fernandez [12] and [20]. Bootstrap and
cross-validation methods are developed in [T^ to select the bandwidth in the
presence of short-range and long-range dependence, while [H] propose a plugin
method for short-range dependent errors.

In the functional data setting considered here (that is, when for each
statistical unit a whole curve is observed at discrete times), several authors
have studied the estimation of a regression function by means of the
Gasser-Müller kernel estimator. For instance, in the case of (continuous-time)
covariance-stationary error processes, Hart and Wehrly [15] derive asymptotic
bias and variance expansions and select the bandwidth by optimizing an estimate
of the integrated mean squared error based on the empirical autocovariance.
This work is extended to nonstationary error processes with parametric
covariance in ^U\ . Benhenni and Rachdi [2], [3] derive asymptotic bias and
variance expressions when the errors are general nonstationary processes with
regular covariance. In the context of smoothing splines, Rice and Silverman
[22] propose a cross-validation method for functional data that leaves one
curve (instead of one time point) out at a time. The optimality properties of
this method are established in [16]. Asymptotic distributions of local
polynomial regression estimators for longitudinal or functional data can be
found in [27]. Degras [6] provides consistency conditions for general linear
estimators and builds normal simultaneous confidence intervals for the
regression function. Studying the local linear estimation of a univariate or
bivariate regression function, Degras [7] elaborates a Central Limit Theorem in
the space of continuous functions and applies it to build simultaneous
confidence bands and tests based on supremum norms. However, no result seems to
be available in the functional data setting for the nonparametric estimation of
regression derivatives.

In this paper we consider the situation where, for each of $n$ statistical
units, a curve is observed at the same $N$ sampling points generated by a
positive density in some bounded interval, say $[0,1]$. The data-generating
process is the sum of a regression function $m$ and a general error process
$\varepsilon$. We are interested in the estimation of $m$ and its derivatives by
local polynomial fitting. The main contributions of this work are as follows.
First, under differentiability conditions on the covariance function of
$\varepsilon$, we derive asymptotic expressions for the bias and variance of the
local polynomial estimator as $n, N \to \infty$. Note that the bias expansions can
be found elsewhere in the literature (e.g. [26]) as they do not depend on the
stochastic structure of the measurement errors. The variance expansions, on the
other hand, provide new and important convergence results for the estimation of
the regression function and its derivatives using functional data. In
particular they highlight the influence of the bandwidth and the covariance
structure in the first and second order expansion terms. Second, we deduce
optimal sampling densities (see e.g. [H H] for other examples of optimal
designs) as well as optimal bandwidths in a few important cases (local constant
or linear fit of $m$, local linear or quadratic fit of $m'$). These quantities can
be estimated in practice by plugin methods. Third, we prove, for inference
purposes, the asymptotic normality of the estimators. Fourth, we conduct
extensive simulations to compare: (i) local polynomial fits of different orders,
(ii) local polynomial fits based on different bandwidths (exact optimal
bandwidth, asymptotic optimal bandwidth, and cross-validation bandwidth). The
simulations use local polynomial smoothers of order $p = 0, 1, 2$ to estimate
$m$ or $m'$ with different target functions, error processes, and values of $n, N$.
With this numerical study, we try to answer three specific questions: Is there
a better order of local polynomial fit to use in a given scenario? Are the
performances of local estimators based on asymptotic optimal bandwidths good
enough to justify the development of plug-in methods? Does the naive approach
that consists in using the cross-validated bandwidth to estimate $m^{(\nu)}$
for some $\nu \ge 1$ give reasonable results (note that, in general,
cross-validation aims to produce good bandwidths for the estimation of $m$ and
not $m^{(\nu)}$)?

The rest of the paper is organized as follows. The regression model and 
local polynomial estimators are defined in Section 2. The main asymptotic 
results are contained in Section 3. The simulation study is displayed in Sec- 
tion 4 and a discussion is provided in Section 5. Finally, proofs are deferred 
to the Appendix. 

2. Local polynomial regression 

We consider the statistical problem of estimating a regression function
and its derivatives for a fixed design model. We consider $n$ experimental
units, each of them having $N$ measurements of the response:

$$Y_i(x_j) = m(x_j) + \varepsilon_i(x_j), \quad i = 1, \ldots, n, \quad j = 1, \ldots, N, \tag{1}$$

where $m$ is the unknown regression function and the $\varepsilon_i$ are i.i.d. error
processes with mean zero and autocovariance function $\rho$.

The observation points $x_j$, $j = 1, \ldots, N$, are taken to be regularly spaced
quantiles of a continuous positive density $f$ on $[0,1]$:

$$\int_0^{x_j} f(x)\,dx = \frac{j}{N}, \quad j = 1, \ldots, N. \tag{2}$$

Note that the uniform density $f = 1_{[0,1]}$ corresponds to an equidistant design.
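
As an illustration of the quantile design (2), the design points can be computed by inverting the cumulative distribution function of the sampling density. The sketch below is only a minimal example: it assumes the quantile levels are $j/N$, and the helper name `design_points` as well as the non-uniform density $f(x) = 0.5 + x$ are hypothetical choices that do not come from the paper.

```python
def design_points(cdf, N, tol=1e-12):
    """Return x_1 < ... < x_N with cdf(x_j) = j / N, found by bisection.

    `cdf` is assumed to be the cumulative distribution function of a
    continuous positive density on [0, 1], hence strictly increasing.
    """
    points = []
    for j in range(1, N + 1):
        target = j / N
        lo, hi = 0.0, 1.0
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if cdf(mid) < target:
                lo = mid
            else:
                hi = mid
        points.append(0.5 * (lo + hi))
    return points


# Uniform density f = 1 on [0, 1]: the design is equidistant.
uniform = design_points(lambda x: x, 5)

# Hypothetical non-uniform density f(x) = 0.5 + x, with cdf 0.5*x + 0.5*x**2:
# more mass near 1, so the points are denser there.
skewed = design_points(lambda x: 0.5 * x + 0.5 * x * x, 5)
```

Bisection is used here only because it needs nothing beyond monotonicity of the distribution function; any root finder would do.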
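Let $\bar Y_{\cdot j} = \frac{1}{n} \sum_{i=1}^{n} Y_i(x_j)$ be the sample average at location $x_j$ and let
$0 \le \nu \le p$ be integers. For each $x \in [0,1]$, the local polynomial estimator of
order $p$ of the $\nu$th order derivative $m^{(\nu)}(x)$ of the regression function
(assumed to exist) is defined as $\hat m_\nu(x) = \nu!\,\hat\beta_\nu(x)$, where
$\hat\beta_\nu(x)$ is the $\nu$th component of the estimate
$\hat\beta(x) = (\hat\beta_0(x), \ldots, \hat\beta_p(x))'$ which is the solution to the
minimization problem

$$\min_{\beta(x)} \sum_{j=1}^{N} \Big( \bar Y_{\cdot j} - \sum_{k=0}^{p} \beta_k(x) (x_j - x)^k \Big)^2 \frac{1}{h} K\Big( \frac{x_j - x}{h} \Big),$$

where $h$ denotes a positive bandwidth and $K$ is a kernel function. Let
$\bar Y = (\bar Y_{\cdot 1}, \ldots, \bar Y_{\cdot N})'$ and denote the canonical basis of
$\mathbb{R}^{p+1}$ by $(e_k)_{k=0,\ldots,p}$ ($e_k$ has a 1 in the $(k+1)$th position and 0
elsewhere). Finally define the matrix

$$X = \begin{pmatrix} 1 & (x_1 - x) & \cdots & (x_1 - x)^p \\ \vdots & \vdots & & \vdots \\ 1 & (x_N - x) & \cdots & (x_N - x)^p \end{pmatrix}$$

and $W = \mathrm{diag}\big( \frac{1}{h} K\big( \frac{x_j - x}{h} \big) \big)$. Then the estimator $\hat m_\nu(x)$ can be written as

$$\hat m_\nu(x) = \nu!\, e_\nu' \hat\beta(x), \quad \text{with } \hat\beta(x) = (X'WX)^{-1} X'W \bar Y. \tag{3}$$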
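
To make the closed form (3) concrete, here is a minimal sketch that evaluates $\nu!\, e_\nu' (X'WX)^{-1} X'W \bar Y$ at a single point by solving the weighted normal equations. It is an illustration under stated assumptions, not the authors' implementation: the Epanechnikov kernel, the function names `solve` and `local_poly`, and the noise-free test curve are all hypothetical choices.

```python
import math


def epanechnikov(u):
    """Symmetric density with support [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def solve(a, b):
    """Solve the small linear system a z = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (m[i][n] - sum(m[i][c] * z[c] for c in range(i + 1, n))) / m[i][i]
    return z


def local_poly(x, xs, ybar, h, p, nu, kernel=epanechnikov):
    """Estimate m^(nu)(x) as nu! * e_nu' (X'WX)^{-1} X'W ybar."""
    w = [kernel((xj - x) / h) / h for xj in xs]
    # (X'WX)[k][l] = sum_j w_j (x_j - x)^(k+l); (X'W ybar)[k] = sum_j w_j (x_j - x)^k ybar_j.
    xtwx = [[sum(wj * (xj - x) ** (k + l) for wj, xj in zip(w, xs))
             for l in range(p + 1)] for k in range(p + 1)]
    xtwy = [sum(wj * (xj - x) ** k * yj for wj, xj, yj in zip(w, xs, ybar))
            for k in range(p + 1)]
    beta = solve(xtwx, xtwy)
    return math.factorial(nu) * beta[nu]


# Noise-free check on m(t) = 1 + 2t + 3t^2: a local quadratic fit is exact,
# so the estimates should match m(0.5) = 2.75 and m'(0.5) = 5 up to rounding.
N = 50
xs = [j / N for j in range(1, N + 1)]
ybar = [1.0 + 2.0 * t + 3.0 * t * t for t in xs]
m_hat = local_poly(0.5, xs, ybar, h=0.2, p=2, nu=0)
dm_hat = local_poly(0.5, xs, ybar, h=0.2, p=2, nu=1)
```

In practice `ybar` would hold the pointwise averages of the $n$ observed curves; the estimator only sees the data through these averages.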

3. Asymptotic study 

3.1. Bias and variance expansions 

The following assumptions are needed for the asymptotic study of $\hat m_\nu(x)$:

(A1) The kernel $K$ is a Lipschitz-continuous, symmetric density function
with support $[-1,1]$.

(A2) The bandwidth $h = h(\nu, n, N)$ satisfies $h \to 0$, $Nh^2 \to \infty$, and
$nh^{2\nu} \to \infty$ as $n, N \to \infty$.

(A3) The regression function $m$ has $(p+2)$ continuous derivatives on $[0,1]$.

(A4) The sampling density $f$ has one continuous derivative on $[0,1]$.

(A5) The covariance function $\rho$ is continuous on the unit square $[0,1]^2$ and
has continuous first-order partial derivatives off the main diagonal.
These derivatives have left and right limits on the main diagonal determined by
$\rho^{(0,1)}(x, x^-) = \lim_{y \nearrow x} \rho^{(0,1)}(x, y)$ and
$\rho^{(0,1)}(x, x^+) = \lim_{y \searrow x} \rho^{(0,1)}(x, y)$.

We now introduce several useful quantities associated with $K$. Let
$\mu_k = \int_{-1}^{1} u^k K(u)\,du$ be the $k$th moment of $K$ and define the vectors
$c = (\mu_{p+1}, \ldots, \mu_{2p+1})'$ and $\tilde c = (\mu_{p+2}, \ldots, \mu_{2p+2})'$. Let
$S = (\mu_{k+l})$, $\tilde S = (\mu_{k+l+1})$, $S^* = (\mu_k \mu_l)$, and
$A = \big( \frac12 \iint_{[-1,1]^2} |u - v|\, u^k v^l K(u) K(v)\,du\,dv \big)$ be matrices of size
$(p+1) \times (p+1)$ whose elements are indexed by $k, l = 0, \ldots, p$.
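
The kernel quantities above are easy to tabulate numerically. The following sketch computes the moments and the matrices $S$, $\tilde S$, $S^*$ and $A$ by the midpoint rule. It assumes the Epanechnikov kernel as an example of a kernel with support $[-1,1]$ and reads $A$ as carrying the factor $1/2$; the function names are hypothetical.

```python
def epanechnikov(u):
    """Symmetric density with support [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def moment(k, kernel=epanechnikov, grid=2000):
    """mu_k = integral over [-1, 1] of u^k K(u) du, by the midpoint rule."""
    step = 2.0 / grid
    nodes = [-1.0 + (i + 0.5) * step for i in range(grid)]
    return sum(u ** k * kernel(u) for u in nodes) * step


def kernel_matrices(p, kernel=epanechnikov, grid=200):
    """Return S, S_tilde, S_star and A for a local polynomial fit of order p."""
    mu = [moment(k, kernel) for k in range(2 * p + 3)]
    idx = range(p + 1)
    S = [[mu[k + l] for l in idx] for k in idx]
    S_tilde = [[mu[k + l + 1] for l in idx] for k in idx]
    S_star = [[mu[k] * mu[l] for l in idx] for k in idx]
    # A[k][l] = 0.5 * double integral of |u - v| u^k v^l K(u) K(v) du dv.
    step = 2.0 / grid
    nodes = [-1.0 + (i + 0.5) * step for i in range(grid)]
    kv = [kernel(u) for u in nodes]
    A = [[0.5 * step * step * sum(abs(u - v) * u ** k * v ** l * ku * kw
                                  for u, ku in zip(nodes, kv)
                                  for v, kw in zip(nodes, kv))
          for l in idx] for k in idx]
    return S, S_tilde, S_star, A


S, S_tilde, S_star, A = kernel_matrices(p=1)
```

For a symmetric kernel the odd moments vanish, so every entry of `S_tilde` for $p = 1$ other than the off-diagonal $\mu_2$ entries is numerically zero.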

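The asymptotic bias and variance of the estimator $\hat m_\nu(x)$, for a given
$x \in (0,1)$ and $\nu \in \{0, \ldots, p\}$, are established in the following theorem.

Theorem 1. Assume (A1)-(A5). Then as $n, N \to \infty$,

$$\mathrm{Bias}(\hat m_\nu(x)) = \nu!\, \frac{m^{(p+1)}(x)}{(p+1)!}\, (e_\nu' S^{-1} c)\, h^{p+1-\nu} + \nu! \left[ \frac{m^{(p+2)}(x)}{(p+2)!}\, e_\nu' S^{-1} \tilde c + \frac{m^{(p+1)}(x)}{(p+1)!}\, \frac{f'(x)}{f(x)}\, \big( e_\nu' S^{-1} \tilde c - e_\nu' S^{-1} \tilde S S^{-1} c \big) \right] h^{p+2-\nu} + o(h^{p+2-\nu})$$

and

$$\mathrm{Var}(\hat m_\nu(x)) = \frac{(\nu!)^2 \rho(x,x)}{n h^{2\nu}}\, e_\nu' S^{-1} S^* S^{-1} e_\nu - \frac{(\nu!)^2 \big( \rho^{(0,1)}(x,x^-) - \rho^{(0,1)}(x,x^+) \big)}{n h^{2\nu-1}}\, e_\nu' S^{-1} A S^{-1} e_\nu + o\Big( \frac{1}{n h^{2\nu-1}} \Big).$$
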

Remark 1. The bias expansion of Theorem 1 does not depend on the nature
of the measurement errors (continuous time processes). A similar expansion
can be found e.g. in [H] in the context of independent errors. Also, the
present variance expansion extends the results of [3], [15] on the nonparametric
estimation of the regression function with the Gasser-Müller estimator.

Remark 2. The reason for presenting second-order expansions in Theorem 1
is that first-order terms may vanish due to the symmetry of $K$, which causes
its odd moments to be null. For instance, the first-order terms in the bias
and the variance vanish, respectively, whenever $p - \nu$ is even and $\nu$ is odd.
In both cases, the second-order terms generally make it possible to find exact
rates of convergence and asymptotic optimal bandwidths as in Corollary 4.

If the covariance $\rho$ has continuous first derivatives at $(x, x)$, the
second-order variance term in Theorem 1 vanishes since
$\rho^{(0,1)}(x, x^-) = \rho^{(0,1)}(x, x^+)$. Thus, the variance expansion does not
depend on $h$ when $\nu = 0$ or $\nu$ is odd (see Remark 2). This makes it
impossible to assess the effect of smoothing on the variance of $\hat m_\nu(x)$ or
to optimize the mean squared error with respect to $h$. This problem can be
solved by deriving higher-order variance expansions under stronger
differentiability assumptions on $f$ and $\rho$. As general higher-order expansions
are quite messy and difficult to interpret, we restrict ourselves to the
central case of an equidistant sampling design with $f = 1_{[0,1]}$. For this
purpose, we introduce the matrices
$A_1 = \big( \frac12 (\mu_k \mu_{l+2} + \mu_{k+2} \mu_l) \big)$, $A_2 = (\mu_{k+1} \mu_{l+1})$, and
$A_3 = \big( \frac16 (\mu_{k+3} \mu_{l+1} + \mu_{k+1} \mu_{l+3}) \big)$, indexed by $k, l = 0, \ldots, p$.



Theorem 2. Assume (A1)-(A5) with $f = 1$ on $[0,1]$ (equidistant design).

• Case $\nu$ even. Assume further that $\rho$ is twice continuously differentiable
at $(x, x)$ and Nh^ → ∞ as $n, N \to \infty$. Then

Var(m,,(a;)) = — e^^S S S e,. 



• Case $\nu$ odd. Assume further that $\rho$ is four times continuously
differentiable at $(x, x)$ and Nh^ → ∞ as $n, N \to \infty$. Then

Var(m.(x)) = ('^OV^'^H-^,:^) ^. g-i^^g-i^^ 






Remark 3. The function $m^{(\nu)}$ can be estimated consistently without
smoothing the data in model (1). Interpolation methods would also be consistent,
as can be checked from Degras [6]. Moreover, looking at Theorems 1 and 2,
it is not clear whether the variance of $\hat m_\nu(x)$ is a decreasing function of
$h$. In other words, smoothing more may not always reduce the variance of the
estimator. See Cardot ^ for a similar observation in the context of functional
principal components analysis.

3.2. Optimal sampling densities and bandwidths 

In this section, we discuss the optimization of the (asymptotic) mean
squared error

$$\mathrm{MSE} = E\big( \hat m_\nu(x) - m^{(\nu)}(x) \big)^2 = \mathrm{Bias}(\hat m_\nu(x))^2 + \mathrm{Var}(\hat m_\nu(x))$$

in Theorem 1 with respect to the sampling density $f$ and the bandwidth $h$. A
similar optimization could be carried out in Theorem 2 where the covariance
function $\rho$ is assumed to be more regular (twice or four times differentiable).

We first examine the choice of $f$ that minimizes the asymptotic squared
bias of $\hat m_\nu(x)$, since the asymptotic variance of $\hat m_\nu(x)$ is independent
of $f$, as can be seen in Theorem 1. This optimization may be useful in practice,
especially when the grid size $N$ is not too large and subject to a sampling
cost constraint.

For $p - \nu$ even, $e_\nu' S^{-1} c = 0$ so that the first-order term in the bias
vanishes, as noted in Remark 2. Moreover, the second-order term can be
rendered equal to zero (except at zeros of $m^{(p+1)}(x)$) by taking a sampling
density $f$ such that $g_{p,\nu}(x) = 0$, where

$$g_{p,\nu}(x) = \frac{m^{(p+2)}(x)}{(p+2)!}\, e_\nu' S^{-1} \tilde c + \frac{m^{(p+1)}(x)}{(p+1)!}\, \frac{f'(x)}{f(x)}\, \big( e_\nu' S^{-1} \tilde c - e_\nu' S^{-1} \tilde S S^{-1} c \big).$$

The solution of the previous equation is

$$f_0(x) = d_0\, \big| m^{(p+1)}(x) \big|^{\gamma/(p+2)}, \tag{4}$$

with $d_0$ such that $\int_0^1 f_0(x)\,dx = 1$ and
$\gamma = e_\nu' S^{-1} \tilde c \,/\, \big( e_\nu' S^{-1} \tilde S S^{-1} c - e_\nu' S^{-1} \tilde c \big)$.
Observe that $f_0(x)$ is well-defined over $[0,1]$ if and only if
$m^{(p+1)}(x) \neq 0$ for all $x \in [0,1]$.

With the choice $f = f_0$, the bias of $\hat m_\nu(x)$ is of order $o(h^{p+2-\nu})$, so
that a higher order expansion would be required to get the exact rate of
convergence. In practice, the density $f_0$ depends on the unknown quantity
$m^{(p+1)}(x)$. However, an approximation of $f_0(x)$ can be obtained by replacing
in (4) the derivative $m^{(p+1)}(x)$ by a local polynomial estimator $\hat m_{p+1}(x)$.

For $p - \nu$ odd, the first-order term in the bias is non zero (if $m^{(p+1)}(x) \neq 0$)
but does not depend on $f$. On the other hand, the second-order term vanishes
for any sampling density $f(x)$. Therefore, a higher order expansion of the
bias would be required to get exact terms that depend on $f(x)$ and could
then be optimized.

We turn to the optimization of the bandwidth $h$ and start with a useful
lemma whose proof is in the Appendix.

Lemma 3. Assume (A5) and define $\alpha(x) = \rho^{(0,1)}(x, x^-) - \rho^{(0,1)}(x, x^+)$. Then
$\alpha(x) \ge 0$.

This lemma is easily checked for covariance-stationary processes (see e.g.
Hart and Wehrly [15]) but is less intuitive for general covariance functions $\rho$.
It can be helpful in determining whether the asymptotic variance of $\hat m_\nu(x)$
is a decreasing function of $h$, in which case the MSE can be optimized.
More precisely, in order to derive asymptotic optimal bandwidths throughout
this section, we need to assume that $\alpha(x) > 0$ (or assume higher order
differentiability for $\rho$ if $\alpha(x) = 0$; see Remark 5).
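
The quantity in Lemma 3 is the jump of the partial derivative of the covariance across the diagonal, and it can be approximated by one-sided difference quotients. The sketch below does this for a standard Wiener covariance and for a stationary Ornstein-Uhlenbeck covariance. The unit variance and the rate 15 of the latter are assumptions chosen to mirror the simulation section, and the function names are hypothetical.

```python
import math


def rho_wiener(x, y):
    """Covariance of a standard Wiener process."""
    return min(x, y)


def rho_ou(x, y, lam=15.0):
    """Covariance of a stationary Ornstein-Uhlenbeck process with unit variance."""
    return math.exp(-lam * abs(x - y))


def alpha(rho, x, eps=1e-6):
    """Left limit minus right limit of the derivative of rho in its second argument.

    One-sided difference quotients approximate the two limits on the diagonal.
    """
    left = (rho(x, x) - rho(x, x - eps)) / eps
    right = (rho(x, x + eps) - rho(x, x)) / eps
    return left - right


alpha_wiener = alpha(rho_wiener, 0.5)   # theoretical value: 1
alpha_ou = alpha(rho_ou, 0.5)           # theoretical value: 2 * lam = 30
```

Both values are nonnegative, in line with the lemma; a covariance that is differentiable on the diagonal would give a value close to zero.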

When estimating the regression function itself, $\nu = 0$, the leading variance
term in Theorem 1 does not depend on $h$. If $e_0' S^{-1} A S^{-1} e_0 < 0$, then the
second-order variance term (in $h/n$) is nonnegative and the optimization
of the MSE yields the solution $h = 0$, which is not admissible. In fact,
we suspect that $e_0' S^{-1} A S^{-1} e_0 > 0$ for all kernels $K$ satisfying (A1) and
all integers $p \ge 0$, although we have only checked it for the special and
interesting cases $p \le 2$ (local constant, linear, or quadratic fit) and cannot
provide a proof for more general $p$. Proceeding with this conjecture, the
optimal bandwidth for the MSE exists and can be obtained from Theorem 1.
However, some caution must be taken to separate the cases $p$ even and
$p$ odd, for which the bias expressions are different (see Remark 2 and the
optimization of $f$ above). More precisely, if $\nu = 0$ and $p$ is odd, then the
asymptotic optimal bandwidth is

$$h_{opt} = \left( \frac{(p+1)!^2\, (e_0' S^{-1} A S^{-1} e_0)\, \alpha(x)}{(2p+2)\, \big( m^{(p+1)}(x) \big)^2\, (e_0' S^{-1} c)^2} \right)^{1/(2p+1)} n^{-1/(2p+1)}.$$

In the case where $\nu = 0$ and $p$ is even, the asymptotic optimal bandwidth
becomes

$$h_{opt} = \left( \frac{(e_0' S^{-1} A S^{-1} e_0)\, \alpha(x)}{(2p+4)\, g_{p,0}(x)^2} \right)^{1/(2p+3)} n^{-1/(2p+3)}.$$

Note that in the above optimization, it is assumed that $g_{p,0}(x) \neq 0$, that
is, $f$ is different from the optimal sampling density $f_0$. To optimize the MSE
when the optimal density $f_0$ is used, it would be necessary to derive a higher
order expansion for the bias, and it can be shown that the optimal bandwidth
would then be of order at least $n^{-1/(2p+5)}$.

In the following corollary, we give the optimal bandwidth $h$ in two
important cases ($\nu \in \{0, 1\}$), using, for simplicity, a uniform sampling density
$f = 1$ on $[0,1]$. Optimal bandwidths can be obtained similarly for $\nu \ge 2$.

Corollary 4. Assume (A1)-(A5) with $f = 1$ on $[0,1]$ and $\alpha(x) > 0$.

1. Local constant or linear estimation of $m$ ($\nu = 0$, $p \in \{0, 1\}$). Assume
further that $m''(x) \neq 0$ and $N n^{-2/3} \to \infty$ as $n, N \to \infty$. Then the
optimal bandwidth for the asymptotic MSE of $\hat m_0(x)$ is

$$h_{opt} = \left( \frac{\alpha(x)}{2 \mu_2^2\, m''(x)^2} \iint |u - v|\, K(u) K(v)\,du\,dv \right)^{1/3} n^{-1/3}.$$

2. Local linear or quadratic estimation of $m'$ ($\nu = 1$, $p \in \{1, 2\}$). Assume
further that $m^{(3)}(x) \neq 0$ and $N n^{-2/5} \to \infty$ as $n, N \to \infty$. Then the
optimal bandwidth for the asymptotic MSE of $\hat m_1(x)$ is

$$h_{opt} = \left( \frac{9\, \alpha(x)}{2 \mu_4^2\, m^{(3)}(x)^2} \left| \iint |u - v|\, u v\, K(u) K(v)\,du\,dv \right| \right)^{1/5} n^{-1/5}.$$

Corollary 4 provides the theoretical basis for a plug-in method to select $h$.
Developing such a method and studying its theoretical properties is however
beyond the scope of this paper. In Section 4, the optimal bandwidths of
Corollary 4 are used as benchmarks to assess the popular cross-validation
procedure.
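
For intuition, the bias-variance trade-off behind the first case of the corollary can be checked numerically. The sketch below assumes the local linear setting ($\nu = 0$, $p = 1$, uniform design), in which the part of the asymptotic MSE that depends on $h$ is read as $(\mu_2 m''(x)/2)^2 h^4 - \alpha(x) A_{00} h / n$ with $A_{00} = \frac12 \iint |u - v| K(u) K(v)\,du\,dv$. The Epanechnikov kernel, the numerical values of $m''(x)$, $\alpha(x)$ and $n$, and all function names are illustrative assumptions.

```python
def epanechnikov(u):
    """Symmetric density with support [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def kernel_constants(kernel=epanechnikov, grid=400):
    """Return mu_2 and A_00 = 0.5 * double integral of |u - v| K(u) K(v) (midpoint rule)."""
    step = 2.0 / grid
    nodes = [-1.0 + (i + 0.5) * step for i in range(grid)]
    kv = [kernel(u) for u in nodes]
    mu2 = sum(u * u * k for u, k in zip(nodes, kv)) * step
    a00 = 0.5 * step * step * sum(abs(u - v) * ku * kw
                                  for u, ku in zip(nodes, kv)
                                  for v, kw in zip(nodes, kv))
    return mu2, a00


def h_opt(m2, alpha, n, mu2, a00):
    """Closed-form minimizer of (mu2*m2/2)^2 h^4 - alpha*a00*h/n over h > 0."""
    return (alpha * a00 / (mu2 ** 2 * m2 ** 2)) ** (1.0 / 3.0) * n ** (-1.0 / 3.0)


def h_dependent_mse(h, m2, alpha, n, mu2, a00):
    """h-dependent part of the asymptotic MSE: squared bias plus variance correction."""
    return (0.5 * mu2 * m2) ** 2 * h ** 4 - alpha * a00 * h / n


mu2, a00 = kernel_constants()
m2, alpha, n = 4.0, 1.0, 50          # illustrative values only
h_star = h_opt(m2, alpha, n, mu2, a00)

# Brute-force check on a grid of bandwidths with step 0.001.
grid = [0.001 * i for i in range(1, 1001)]
h_grid = min(grid, key=lambda h: h_dependent_mse(h, m2, alpha, n, mu2, a00))
```

The closed form and the grid search should agree up to the grid resolution, which is a quick sanity check before plugging estimated quantities into the formula.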

Remark 4. In the cases ($\nu = 0$, $p = 1$) and ($\nu = 1$, $p = 2$) of Corollary 4,
the results actually hold for any sampling density $f$ satisfying (A4). Also,
the first part of the corollary corresponds to Theorem 3 of [15] and Corollary
2.1 of [3] when the Gasser-Müller estimator is used along with an equidistant
sampling.

Remark 5. Under the assumptions of Theorem 2, in case 1 of Corollary 4,

1 /9 — — 

the optimal bandwidth is hopt = ( — ^ m"(xf^ ) n~^^'^ if p'^°'^)(x, x) < and 
jY^-3/2 _i. QQ g^g ^^ jY _i. QQ^ otherwise the optimization will not be possible. 

1 /9 

Likewise, in case 2, the optimal bandwidth is hopt = ( — ^^m(3)(x')'2 ) n'^^'^ 
provided that p^^'^\x, x) < and Nn^^^"^ -^ oo as n, N -^ oo. 

Remark 6. Theorem 1 can also be harnessed to derive optimal bandwidths
for global error measures such as the integrated mean squared error

$$\mathrm{IMSE} = \int_0^1 E\big( \hat m_\nu(x) - m^{(\nu)}(x) \big)^2 w(x)\,dx \tag{5}$$

where $w$ is a bounded, positive weight function. More precisely, denoting
by $[-\tau, \tau]$ the support of $K$, the bias and variance expansions in Theorem
1 hold uniformly over $[\tau h, 1 - \tau h]$, and their convergence rates are
maintained in the boundary regions $[0, \tau h)$ and $(1 - \tau h, 1]$ (only the
multiplicative constants are lost). As $n, N \to \infty$, the IMSE is therefore
equivalent to the weighted integral over $[0,1]$ of the (squared) bias plus
variance expansions of Theorem 1. One can thus replace the terms $\alpha(x)$ and
$(m^{(k)}(x))^2$ in Corollary 4 (with $k = 2$ or $k = 3$ depending on the case) by
$\int_0^1 \alpha(x) w(x)\,dx$ and $\int_0^1 (m^{(k)}(x))^2 w(x)\,dx$, respectively, to
obtain global optimal bandwidths.

Remark 7. For the implementation of the (global) optimal bandwidths
corresponding to Corollary 4, the integral $\int_0^1 \alpha(x) w(x)\,dx$ can be estimated by
$V_n = \frac{1}{n} \sum_{i=1}^{n} \sum_{j=2}^{N} \big( Y_i(x_j) - Y_i(x_{j-1}) \big)^2 w(x_j)$, namely the
quadratic variations of the sample processes $Y_i$. The previous estimator is
almost surely consistent as $n, N \to \infty$; see e.g. ^T\ in the case of Gaussian processes.
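
The quadratic-variation estimator of Remark 7 is straightforward to compute. The sketch below reads it as the average over curves of the weighted squared increments between consecutive design points, and tries it on simulated curves with standard Wiener noise, for which the target integral equals 1 when $w = 1$. The smooth mean function, the sample sizes, the seed, and the function names are illustrative assumptions.

```python
import math
import random


def simulate_wiener(xs, rng):
    """One standard Wiener path observed at increasing points xs (with W(0) = 0)."""
    path, value, previous = [], 0.0, 0.0
    for x in xs:
        value += rng.gauss(0.0, math.sqrt(x - previous))
        path.append(value)
        previous = x
    return path


def quadratic_variation(curves, xs, w=lambda x: 1.0):
    """V_n = (1/n) sum_i sum_{j>=2} (Y_i(x_j) - Y_i(x_{j-1}))^2 w(x_j)."""
    n = len(curves)
    total = 0.0
    for y in curves:
        for j in range(1, len(xs)):
            total += (y[j] - y[j - 1]) ** 2 * w(xs[j])
    return total / n


rng = random.Random(0)
n, N = 200, 200
xs = [j / N for j in range(1, N + 1)]
mean = lambda x: 16.0 * (x - 0.5) ** 4       # smooth mean chosen for illustration
curves = [[mean(x) + e for x, e in zip(xs, simulate_wiener(xs, rng))]
          for _ in range(n)]
v_n = quadratic_variation(curves, xs)        # should be close to 1 for Wiener noise
```

The squared increments of the smooth mean add a bias of order $1/N$, so a finer grid brings the estimate closer to its target.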

3.3. Asymptotic normality 

The asymptotic bias and variance expansions of Theorems 1 and 2 provide
the centering and scaling required to determine the limit distribution of the
estimator $\hat m_\nu(x)$. Besides, one may observe that
$\hat m_\nu(x) = \frac{1}{n} \sum_{i=1}^{n} \hat m_{\nu,i}(x)$, where the $\hat m_{\nu,i}$ are the
local polynomial smoothers of the curves $Y_i$, $i = 1, \ldots, n$. Since the
$\hat m_{\nu,i}(x)$'s are i.i.d. with finite variance, as are the $Y_i$'s, the
Central Limit Theorem can then be applied to $\hat m_\nu(x)$ as $n, N \to \infty$. We
consider the asymptotic distribution of $\hat m_\nu(x)$ according to the parity of $\nu$
(see Remark 2 on the vanishing terms in the asymptotic variance) in order
to get the correct scaling term, and also to ensure that the bias term is
asymptotically negligible when multiplied by the scaling rate. The latter
is guaranteed by imposing extra conditions on the bandwidth $h$. Denoting
the convergence in distribution by $\xrightarrow{d}$ and the centered normal distribution
with variance $\sigma^2$ by $\mathcal{N}(0, \sigma^2)$, the limit distribution of $\hat m_\nu(x)$ is
given by the following result.

Theorem 5. Assume (A1)-(A5).

• Case $\nu$ even. Assume further that $nh^{2p+4} \to 0$ if $p$ is even, resp.
$nh^{2p+2} \to 0$ if $p$ is odd, as $n, N \to \infty$. Then

$$\sqrt{n h^{2\nu}}\, \big( \hat m_\nu(x) - m^{(\nu)}(x) \big) \xrightarrow{d} \mathcal{N}\big( 0,\ (\nu!)^2 \rho(x,x)\, (e_\nu' S^{-1} S^* S^{-1} e_\nu) \big).$$

• Case $\nu$ odd and $\alpha(x) > 0$. Assume further that $nh^{2p+1} \to 0$ if $p$ is even,
resp. $nh^{2p+3} \to 0$ if $p$ is odd, as $n, N \to \infty$. Then

$$\sqrt{n h^{2\nu-1}}\, \big( \hat m_\nu(x) - m^{(\nu)}(x) \big) \xrightarrow{d} \mathcal{N}\big( 0,\ (\nu!)^2 \alpha(x)\, |e_\nu' S^{-1} A S^{-1} e_\nu| \big).$$

• Case $\nu$ odd and $\alpha(x) = 0$ (i.e. $\rho$ is continuously differentiable in a
neighborhood of $(x, x)$). Assume further that $\rho$ is four times differentiable
at $(x, x)$, that Nh^ → ∞, and that $nh^{2p} \to 0$ if $p$ is even, resp.
$nh^{2p+2} \to 0$ if $p$ is odd, as $n, N \to \infty$. Then

$$\sqrt{n h^{2\nu-2}}\, \big( \hat m_\nu(x) - m^{(\nu)}(x) \big) \xrightarrow{d} \mathcal{N}\big( 0,\ (\nu!)^2 \rho^{(1,1)}(x,x)\, (e_\nu' S^{-1} A_2 S^{-1} e_\nu) \big).$$

4. Numerical study

In this section we compare the numerical performances of local polynomial
estimators based on several global bandwidths. Specifically, we examine
the local constant and linear estimation of a regression function $m$ and the
local linear and quadratic estimation of $m'$. We consider three types of
bandwidths: (i) the global versions of the asymptotic optimal bandwidths
of Corollary 4, denoted by $h_{as}$; (ii) the bandwidths that minimize the IMSE
on a finite sample with the target $m$ or $m'$ and weight function $w = 1$ in
(5), which are called exact optimal bandwidths and denoted by $h_{ex}$; (iii) the
popular "leave-one-curve-out" cross-validation bandwidths, denoted by $h_{cv}$.
The interested reader may refer to Rice and Silverman [22] and Hart and
Wehrly [16] for a detailed account and theoretical justification of this
cross-validation method. In short, this method selects the smoothing parameter
for which the estimator based on all observed curves but one best predicts the
remaining curve. In model (1), considering the local polynomial estimation
of $m$ by $\hat m_0$ in (3), $h_{cv}$ is obtained by minimizing

$$\mathrm{cv}(h) = \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{N} \big( \hat m_0^{(-i)}(x_j; h) - Y_i(x_j) \big)^2 \tag{6}$$

where $\hat m_0^{(-i)}(\cdot\,; h)$ is the local polynomial smoother of order $p$ and bandwidth
$h$ applied to the data $Y_k(x_j)$, $k \in \{1, \ldots, n\} \setminus \{i\}$, $j = 1, \ldots, N$. Although
the cross-validation score (6) is designed for the estimation of $m$, it is of
interest to see how it performs for the estimation of derivatives $m^{(\nu)}$, $\nu \ge 1$.
This question is particularly justified in local polynomial fitting, where all
derivatives of $m$ up to order $p$ are being estimated simultaneously.
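
The leave-one-curve-out principle can be sketched in a few lines. The example below is a simplified illustration rather than the procedure used in the paper: it uses a local constant smoother ($p = 0$) with an Epanechnikov kernel, normalizes the score by the number of curves and design points (which does not change the minimizer), and searches over a small hypothetical list of candidate bandwidths. It exploits the linearity of the smoother: smoothing the average of the remaining curves is the same as averaging their individual smooths.

```python
import math
import random


def epanechnikov(u):
    """Symmetric density with support [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def smoother_weights(xs, h, kernel=epanechnikov):
    """Row-normalized kernel weights of the local constant (p = 0) smoother."""
    rows = []
    for x in xs:
        k = [kernel((xj - x) / h) for xj in xs]
        total = sum(k)
        rows.append([kj / total for kj in k])
    return rows


def cv_score(curves, xs, h):
    """Leave-one-curve-out cross-validation score, averaged over curves and points."""
    n, N = len(curves), len(xs)
    rows = smoother_weights(xs, h)
    col_sums = [sum(y[j] for y in curves) for j in range(N)]
    score = 0.0
    for y in curves:
        # Average of the other n - 1 curves, then smoothed (the smoother is linear).
        loo_mean = [(col_sums[j] - y[j]) / (n - 1) for j in range(N)]
        for j in range(N):
            fit = sum(wjk * v for wjk, v in zip(rows[j], loo_mean))
            score += (fit - y[j]) ** 2
    return score / (n * N)


def cv_bandwidth(curves, xs, candidates):
    """Candidate bandwidth with the smallest cross-validation score."""
    return min(candidates, key=lambda h: cv_score(curves, xs, h))


# Toy data: smooth mean plus independent Wiener paths (illustrative only).
rng = random.Random(1)
n, N = 20, 40
xs = [j / N for j in range(1, N + 1)]
curves = []
for _ in range(n):
    noise, value = [], 0.0
    for _ in range(N):
        value += rng.gauss(0.0, math.sqrt(1.0 / N))
        noise.append(value)
    curves.append([math.sin(2 * math.pi * x) + e for x, e in zip(xs, noise)])

candidates = [0.03, 0.05, 0.1, 0.2, 0.4]
h_cv = cv_bandwidth(curves, xs, candidates)
```

Replacing the local constant weights by those of a higher-order local polynomial fit only changes `smoother_weights`; the leave-one-curve-out loop stays the same.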
The regression functions used in the simulations are

$$m_1(x) = 16 (x - 0.5)^4, \qquad m_2(x) = \frac{1}{1 + e^{-10(x - 0.5)}} + 0.03 \sin(6 \pi x). \tag{7}$$

The polynomial function $m_1$ has unit range and has relatively high curvature
away from its minimum at $x = 0.5$. The function $m_2$ is a linear combination
of a logistic function and a rapidly varying sine function. The factor 0.03 is
chosen so that the sine function has a small influence on $m_2$ but a much larger
one on $m_2'$. These functions and their first derivatives are displayed in Figure 1.

For the stochastic part of (1) we consider Gaussian processes with mean
zero and covariance functions

$$\rho_1(x, y) = \min(x, y), \qquad \rho_2(x, y) = \exp(-\lambda |x - y|).$$

The first error process is a standard Wiener process on $[0,1]$; the second
is a stationary Ornstein-Uhlenbeck process. The parameter $\lambda = 15$ in $\rho_2$
allows us to inspect various correlation levels between two consecutive
measurements (e.g. 0.22 for $N = 10$ and 0.86 for $N = 100$). The variance
levels of these processes are chosen so that the signal-to-noise ratios
($\mathrm{SNR} = \{\max_t m(t) - \min_t m(t)\} / \{n^{-1/2} \int \rho(t,t)^{1/2}\,dt\}$) are
fairly low for $n$ small and high for $n$ large. For instance, when $n = 10$, the
SNR is 3.16 with the Ornstein-Uhlenbeck process and 6.32 with the Wiener
process. An SNR between 4 and 5 corresponds to a moderate noise level in [15].
In the estimation of derivatives, the influence of measurement errors is much
stronger.
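
The two error processes are simple to simulate exactly on a grid, since both are Gaussian Markov processes. The sketch below is an illustration in Python (the paper's simulations were run in R): it assumes an equidistant grid $x_j = j/N$ and a unit-variance Ornstein-Uhlenbeck process with rate 15, and the function names are hypothetical.

```python
import math
import random


def wiener_path(xs, rng):
    """Standard Wiener process at increasing points xs: cov(x, y) = min(x, y)."""
    path, value, previous = [], 0.0, 0.0
    for x in xs:
        value += rng.gauss(0.0, math.sqrt(x - previous))
        path.append(value)
        previous = x
    return path


def ou_path(xs, rng, lam=15.0):
    """Stationary Ornstein-Uhlenbeck process, unit variance: cov = exp(-lam |x - y|)."""
    path = [rng.gauss(0.0, 1.0)]
    for prev_x, x in zip(xs, xs[1:]):
        r = math.exp(-lam * (x - prev_x))          # correlation between neighbors
        path.append(r * path[-1] + math.sqrt(1.0 - r * r) * rng.gauss(0.0, 1.0))
    return path


def lag_one_correlation(N, lam=15.0):
    """Correlation of two consecutive measurements on an equidistant grid of size N."""
    return math.exp(-lam / N)


rng = random.Random(2)
N = 100
xs = [j / N for j in range(1, N + 1)]
w = wiener_path(xs, rng)
z = ou_path(xs, rng)
```

With a grid spacing of $1/N$, `lag_one_correlation` reproduces the correlation levels quoted in the text for $N = 10$ and $N = 100$.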

The simulations were conducted in the R software environment. All four
combinations of $m_1, m_2$ and $\rho_1, \rho_2$ were studied with the number of
experimental units $n$ and the sampling design size $N$ varying in $\{10, 50, 100\}$.
Different estimation targets ($m_i^{(\nu)}$, $\nu = 0, 1$) and local polynomial
estimators were considered ($p = 0, 1$ for $\nu = 0$, i.e. local constant and linear
fits, and $p = 1, 2$ for $\nu = 1$, i.e. local linear and quadratic fits). In each
case, 1000 instances of model (1) were simulated. The kernel $K$ was a truncated
Gaussian density and the global bandwidths $h$ considered were the exact optimal
bandwidth $h_{ex}$ (obtained by minimizing the true IMSE with $w = 1$; see Remark 6),
the asymptotic optimal bandwidth $h_{as}$ (see Corollary 4 and Remark 6), and the
cross-validated bandwidth $h_{cv}$.

Some of the extensive simulation results are presented in Tables 1-5. In
each table, columns 3-4 are the exact and asymptotic optimal bandwidths;
column 5 is the median cross-validated bandwidth over 1000 simulations;
columns 6-7-8 are the median $L^2$ estimation errors
$\int_0^1 (\hat m_\nu(x) - m^{(\nu)}(x))^2\,dx$ (first and third quartiles are between
brackets) with the exact optimal, asymptotic optimal, and cross-validated
bandwidth over the 1000 simulations. As the estimation errors are strongly
right-skewed and feature outliers, these errors are described in terms of
quantiles rather than mean and standard deviation.

We first comment on the results for the estimation of $m$. Looking at Tables
1 and 2 (local linear estimation of $m_1$ with covariance $\rho_1$ or $\rho_2$), it
appears that the bandwidths $h_{ex}$, $h_{as}$, $h_{cv}$ are very close and yield similar
performances for almost all $n, N$. However, note in Table 1 that $h_{as}$ performs
worse when $n = 50, 100$ and $N = 10$, which can be expected since
this bandwidth is only optimal for large $N$. Also, local constant estimation
of $m_1$ or $m_2$ (not displayed here) yields very similar results to local
linear estimation. In Table 3 (local linear estimation of $m_2$ with covariance
$\rho_2$), the bandwidths $h_{ex}$ and $h_{cv}$ can be infinite for $n = 10$. This
remarkable fact has two reasons. First, the shape of $m_2$ is close to linear,
which means that when there are few design points, increasing the bandwidth $h$
of the estimator only increases its bias marginally (the local linear estimator
is unbiased for linear functions). Second, under the Ornstein-Uhlenbeck
noise, the estimator's variance reduces drastically as $h$ increases
(much more so than under the Wiener noise). This can be seen in Theorem 1
where the corrective term in the variance is
$-\frac{\alpha(x)}{2n}\, h \iint |u - v| K(u) K(v)\,du\,dv$,
with $\alpha(x) = \rho^{(0,1)}(x, x^-) - \rho^{(0,1)}(x, x^+) = 30$ for the Ornstein-Uhlenbeck
covariance $\rho_2$ and only $\alpha(x) = 1$ for the Wiener covariance $\rho_1$.

>>> INSERT TABLES 1-2-3 ABOUT HERE <<<

We now turn to the results concerning the estimation of the derivative $m'$
in Tables 4 and 5 along with Figure 2. Over the simulations, the use of $h_{ex}$
appears to appreciably reduce the $L^2$-error in comparison to $h_{as}$ and $h_{cv}$. For
the local linear estimation of $m'$, the reduction is 13% and 16%, respectively
(median reduction in $L^2$-error over all combinations of $n$, $N$, $m_i$, and $\rho_i$).
The higher performance of $h_{ex}$ over $h_{as}$ and $h_{cv}$ (and of $h_{as}$ over $h_{cv}$
when $N \ge 50$) can be observed in Tables 4-5 in the case of the regression
function $m_1$ and covariance $\rho_1$. In fact, similar comparisons hold for all
choices of $m$ and $\rho$. For the local quadratic estimation of $m'$, the use of
$h_{ex}$ and $h_{cv}$ reduces the $L^2$-error by respectively 45% and 30%, in comparison
to $h_{as}$ (over all combinations of $n$, $N$, $m_i$, and $\rho_i$). It is noteworthy
that $h_{as}$ is systematically smaller than $h_{ex}$ (the difference between the two
bandwidths is larger when estimating $m'$ than when estimating $m$) but the two
bandwidths $h_{as}$ and $h_{ex}$ are closer for $p = 1$ (local linear fit) than for
$p = 2$ (local quadratic).

Comparing local linear to local quadratic estimation, the latter can
considerably reduce the bias at the expense of increasing the variance, which is
a consequence of adding an extra (quadratic) parameter in the local fit. Which
order of local polynomial fit achieves better performances in a given scenario
depends on the balance between bias and variance. It can be seen from Tables
4 and 5 that when $h_{ex}$ is used in the simulations, the local quadratic
estimator yields appreciably better results than the local linear when the target
is $m_1'$ (due to the fairly high curvature of $m_1'$ which makes the bias large
in comparison to the variance). The situation is however reversed with the
target $m_2'$ (that has relatively low curvature), as shown in Figure 2. In this
figure, the local quadratic estimator has a slightly smaller (squared and
integrated) bias than the local linear for small $h$ (see left panel). On the
other hand, the local linear estimator has much smaller variance than the
quadratic for all $h$ (middle panel). Overall, the optimal IMSE is smaller for the
local linear estimator and the optimal bandwidths are quite different (right
panel), $h_{opt} \approx 0.13$ for the linear one and $h_{opt} \approx 0.30$ for the quadratic one.

>>> INSERT TABLES 4-5 ABOUT HERE <<<

>>> INSERT FIGURE 2 ABOUT HERE <<<


5. Discussion 

We have examined in this paper the local polynomial estimation of a 
regression function and its derivatives in the context of functional data. Our 
main theoretical contribution has been to derive second-order asymptotic 
expansions for the bias and variance of the estimator based on a sampling 
density not necessarily uniform. These expansions give qualitative insights 
on the large-sample behavior of the estimators and highlight in particular 
how the covariance and the choice of the bandwidth affect the estimator's 
variance. Our result fills an important gap in the literature as, to this date, no 
asymptotic theory seems available on the estimation of regression derivatives 
with functional data under correlated errors. This topic is relevant in practice 
since for many functional data sets, essential information may be carried by 
derivatives of the observed curves. Note that our results may be extended to
the multivariate regression setup and also to noisy functional data.

We have applied our main result to determine optimal sampling densities
and bandwidths that can be estimated in practice by plugin methods.
To examine the potential benefits of a plugin method for bandwidth selection,
we have compared numerically the performances of local polynomial
estimators based on the asymptotic optimal bandwidth $h_{as}$ (necessary for the
plugin method), the exact optimal bandwidth $h_{ex}$, and the cross-validated
bandwidth $h_{cv}$ of Rice and Silverman [22]. The simulations indicate that a
plugin method could be an interesting alternative to cross-validation for data
sets with moderate to large numbers of observation points (which is typically
the case for functional data), especially for estimating the derivatives of the
regression function $m$. Developing a plugin method would however require
estimating the partial derivatives of the covariance function $\rho$ and some
higher-order derivative of $m$. Another outcome of the simulations is that
although cross-validation is not meant for derivative estimation, estimators
based on $h_{cv}$ generally give satisfactory results both when the target is $m$ and
when it is $m'$. Finally, our simulations suggest the use of local linear fits both
for estimating $m$ (rather than local constant) and $m'$ (rather than local quadratic),
as these estimators are more stable (especially for small sample sizes $N$) and
give reasonable estimates in most situations.
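As an illustration of cross-validated bandwidth selection for curve data, the sketch below implements a leave-one-curve-out criterion in the spirit of Rice and Silverman: each curve is predicted by a local linear fit to the average of the remaining curves. The kernel, the simulated curves, the bandwidth grid and the function names are assumptions made for the example.

```python
import math
import random

def epanechnikov(u):
    """Epanechnikov kernel on [-1, 1] (assumed kernel)."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def local_linear(x0, xs, ys, h):
    """Local linear estimate at a design point x0 (closed form for p = 1)."""
    s0 = s1 = s2 = t0 = t1 = 0.0
    for xj, yj in zip(xs, ys):
        d = xj - x0
        w = epanechnikov(d / h)
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * yj
        t1 += w * d * yj
    det = s0 * s2 - s1 * s1
    if det < 1e-12:  # a single point in the window: fall back to a local constant
        return t0 / s0
    return (s2 * t0 - s1 * t1) / det

def cv_score(curves, xs, h):
    """Leave-one-curve-out squared prediction error for bandwidth h."""
    n = len(curves)
    totals = [sum(col) for col in zip(*curves)]
    score = 0.0
    for curve in curves:
        # Average of the other n - 1 curves at each observation point.
        others = [(t - y) / (n - 1) for t, y in zip(totals, curve)]
        fit = [local_linear(x, xs, others, h) for x in xs]
        score += sum((y - f) ** 2 for y, f in zip(curve, fit))
    return score / n

def select_bandwidth(curves, xs, grid):
    """Bandwidth of the grid minimizing the cross-validation score."""
    return min(grid, key=lambda h: cv_score(curves, xs, h))

# Hypothetical data: n = 10 curves at N = 30 points, sine mean plus random-walk noise.
random.seed(0)
xs = [(j + 0.5) / 30 for j in range(30)]
curves = []
for _ in range(10):
    noise, path = 0.0, []
    for x in xs:
        noise += random.gauss(0.0, 0.1)
        path.append(math.sin(2 * math.pi * x) + noise)
    curves.append(path)
grid = [0.05, 0.1, 0.2, 0.4]
print(select_bandwidth(curves, xs, grid))
```

Because the criterion measures prediction error for the curve itself, it targets $m$ rather than $m'$, which is consistent with the remark that cross-validation is not designed for derivative estimation.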

Lastly, we have established the asymptotic normality of the local polynomial
estimator in the pointwise sense. This result can be applied in various
inference procedures. By following the arguments of [7], a stronger asymptotic
normality result can be obtained in the space of continuous functions
equipped with the sup-norm. This makes it possible to conduct simultaneous
inference on the regression derivatives.

Appendix: Proofs 

Throughout the proofs, the dependence of vectors and matrices on $N$ is
denoted explicitly to clarify the arguments. Also, to fix ideas, the compact
support of $K$ is taken to be $[-1, 1]$ without loss of generality.

Proof of Theorem 1: bias term 

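Let us write $\mathbf{m}_N = (m(x_1), \ldots, m(x_N))'$ and define the $(p+1) \times (p+1)$
matrix $\mathbf{S}_N = N^{-1} \mathbf{X}_N' \mathbf{W}_N \mathbf{X}_N$ with $(k,l)$th element ($0 \le k, l \le p$) given by

$$
s_{k+l,N} = \frac{1}{Nh} \sum_{j=1}^{N} (x_j - x)^{k+l} K\!\left(\frac{x_j - x}{h}\right).
$$

It follows from (5) that

$$
E\big(\hat{\boldsymbol{\beta}}_N(x)\big) = N^{-1} \mathbf{S}_N^{-1} \mathbf{X}_N' \mathbf{W}_N \mathbf{m}_N. \tag{9}
$$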

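With (A3), a Taylor expansion of $m(x_j)$ at the order $(p+2)$ yields

$$
m(x_j) = m(x) + (x_j - x)\, m'(x) + \ldots + \frac{(x_j - x)^{p+2}}{(p+2)!}\, m^{(p+2)}(x) + o\big((x_j - x)^{p+2}\big)
$$

and thus

$$
\mathbf{m}_N = \mathbf{X}_N \boldsymbol{\beta}(x)
+ \beta_{p+1} \begin{pmatrix} (x_1 - x)^{p+1} \\ \vdots \\ (x_N - x)^{p+1} \end{pmatrix}
+ \big(\beta_{p+2} + o(1)\big) \begin{pmatrix} (x_1 - x)^{p+2} \\ \vdots \\ (x_N - x)^{p+2} \end{pmatrix}.
$$

Hence the bias in the estimation of $\boldsymbol{\beta}(x)$ is

$$
E\big(\hat{\boldsymbol{\beta}}_N(x)\big) - \boldsymbol{\beta}(x) = \beta_{p+1} \mathbf{S}_N^{-1} \mathbf{c}_N + \big(\beta_{p+2} + o(1)\big) \mathbf{S}_N^{-1} \tilde{\mathbf{c}}_N, \tag{10}
$$

where $\mathbf{c}_N = (s_{p+1,N}, \ldots, s_{2p+1,N})'$ and $\tilde{\mathbf{c}}_N = (s_{p+2,N}, \ldots, s_{2p+2,N})'$.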

With the regularity of the sampling design (2), the Lipschitz-continuity of
$K$ and the compactness of its support, some straightforward calculations allow
us to approximate the elements of the matrix $\mathbf{S}_N$ by

$$
s_{k+l,N} = h^{k+l} \left( \int_{-1}^{1} u^{k+l} K(u) f(x + hu)\,du + O\big((Nh)^{-1}\big) \right), \tag{11}
$$

the $O((Nh)^{-1})$ being the error in the integral approximation of a Riemann
sum.
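The Riemann-sum approximation (11) can be checked numerically. The sketch below assumes a uniform sampling density ($f \equiv 1$), a midpoint design and the Epanechnikov kernel, so that the integral in (11) reduces to the kernel moment $\mu_r$ with $r = k + l$; the function names are hypothetical.

```python
def epanechnikov(u):
    """Epanechnikov kernel on [-1, 1] (assumed kernel)."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def kernel_moment(r):
    """Exact r-th moment of the Epanechnikov kernel: 3/((r+1)(r+3)) if r is even, else 0."""
    return 3.0 / ((r + 1) * (r + 3)) if r % 2 == 0 else 0.0

def s_moment(r, x, h, design):
    """Design moment s_{r,N} = (Nh)^{-1} sum_j (x_j - x)^r K((x_j - x)/h)."""
    n_pts = len(design)
    return sum((xj - x) ** r * epanechnikov((xj - x) / h) for xj in design) / (n_pts * h)

# Assumed setting: N = 2000 midpoints of [0, 1] (uniform density), h = 0.1, x = 0.5.
N, h, x = 2000, 0.1, 0.5
design = [(j + 0.5) / N for j in range(N)]
for r in range(5):
    # s_{r,N} / h^r should be close to mu_r when f = 1.
    print(r, s_moment(r, x, h, design) / h ** r, kernel_moment(r))
```

With a non-uniform density the same comparison would involve $\int u^r K(u) f(x + hu)\,du$ instead of $\mu_r$, which is where the first-order term in $f'(x)$ of the next expansion comes from.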

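From assumption (A4), a Taylor expansion of $f(x + hu)$ at order 1 yields

$$
s_{k+l,N} = h^{k+l} \big( \mu_{k+l} f(x) + h \mu_{k+l+1} f'(x) + o(h) \big) \tag{12}
$$

under the condition $Nh^2 \to \infty$, which in matrix form stands as

$$
\mathbf{S}_N = \mathbf{H} \big( f(x) \mathbf{S} + h f'(x) \tilde{\mathbf{S}} + o(h) \big) \mathbf{H}, \tag{13}
$$

where $\mathbf{H} = \mathrm{diag}(1, h, \ldots, h^p)$. In particular, it holds that

$$
\begin{aligned}
\mathbf{c}_N &= h^{p+1} \mathbf{H} \big( f(x) \mathbf{c} + (h f'(x) + o(h)) \tilde{\mathbf{c}} \big), \\
\tilde{\mathbf{c}}_N &= h^{p+2} \mathbf{H} \big( f(x) + o(1) \big) \tilde{\mathbf{c}}.
\end{aligned} \tag{14}
$$

Now, due to the fact that $(\mathbf{U} + h\mathbf{V})^{-1} = \mathbf{U}^{-1} - h \mathbf{U}^{-1} \mathbf{V} \mathbf{U}^{-1} + o(h)$ for
any two invertible matrices $\mathbf{U}, \mathbf{V}$ of compatible dimensions, we have

$$
\mathbf{S}_N^{-1} = \mathbf{H}^{-1} \left( \frac{1}{f(x)} \mathbf{S}^{-1} - h \frac{f'(x)}{f^2(x)} \mathbf{S}^{-1} \tilde{\mathbf{S}} \mathbf{S}^{-1} + o(h) \right) \mathbf{H}^{-1}. \tag{15}
$$

Plugging (14)-(15) in (10) and truncating the expansion to the second
order, the bias expression of Theorem 1 follows.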
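For completeness, the first-order expansion of the matrix inverse used to obtain (15) can be derived as follows. This is a sketch under the assumption that $U$ is invertible and $h$ is small enough that $h \lVert U^{-1} V \rVert < 1$; the identification of $U$ and $V$ with the two leading terms of (13) is spelled out in the comments.

```latex
% First-order expansion of (U + hV)^{-1} via the Neumann series of (I + h U^{-1} V)^{-1}.
\begin{align*}
(U + hV)^{-1}
  &= \bigl(U (I + h U^{-1} V)\bigr)^{-1}
   = (I + h U^{-1} V)^{-1} U^{-1} \\
  &= \bigl(I - h U^{-1} V + O(h^2)\bigr) U^{-1}
   = U^{-1} - h\, U^{-1} V U^{-1} + O(h^2).
\end{align*}
% With U = f(x) S and V = f'(x) \tilde{S}, the two leading terms of (13):
\begin{align*}
U^{-1} = \frac{1}{f(x)}\, S^{-1},
\qquad
U^{-1} V U^{-1} = \frac{f'(x)}{f^2(x)}\, S^{-1} \tilde{S} S^{-1}.
\end{align*}
```

Pre- and post-multiplying by $H^{-1}$ then gives the form of (15).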

Proof of Theorem 1: variance term 

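Define the $N \times N$ matrix $\boldsymbol{\Sigma}_N = (\rho(x_i, x_j))$ and the $(p+1) \times (p+1)$
matrix $\mathbf{S}_N^* = N^{-2} \mathbf{X}_N' \mathbf{W}_N \boldsymbol{\Sigma}_N \mathbf{W}_N \mathbf{X}_N$. Noting that $\mathrm{Var}(\bar{\mathbf{Y}}) = n^{-1} \boldsymbol{\Sigma}_N$ and
considering (5), it can be seen that

$$
\mathrm{Var}\big(\hat{\boldsymbol{\beta}}_N(x)\big) = n^{-1} \mathbf{S}_N^{-1} \mathbf{S}_N^* \mathbf{S}_N^{-1}. \tag{16}
$$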

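The asymptotic behavior of the matrix $\mathbf{S}_N^*$ is given by the following
lemma.

Lemma 6. Assume (A1)-(A5) for a given $x \in (0, 1)$. Then as $n, N \to \infty$,

$$
\begin{aligned}
\mathbf{S}_N^* = \mathbf{H} \Big\{ & \phi(x,x) \mathbf{S}^* + h \big( \phi^{(0,1)}(x,x^+) - \phi^{(0,1)}(x,x^-) \big) \mathbf{A} \\
& + h \big( \phi^{(0,1)}(x,x^+) + \phi^{(0,1)}(x,x^-) \big) \mathbf{B} + o(h) \Big\} \mathbf{H}
\end{aligned}
$$

with $\mathbf{A}$, $\mathbf{S}^*$ being defined in Section 3, $\mathbf{B} = \big( \tfrac{1}{2} (\mu_{k+1}\mu_l + \mu_k \mu_{l+1}) \big)$, and
$\phi(y,z) = \rho(y,z) f(y) f(z)$.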



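Plugging Lemma 6 and (15) in (16), we have

$$
\begin{aligned}
n f(x)^2\, \mathbf{H}\, \mathrm{Var}\big(\hat{\boldsymbol{\beta}}_N(x)\big)\, \mathbf{H} = {} & \phi(x,x)\, \mathbf{S}^{-1} \mathbf{S}^* \mathbf{S}^{-1} + o(h) \\
& - h \phi(x,x) \frac{f'(x)}{f(x)} \left( \mathbf{S}^{-1} \tilde{\mathbf{S}} \mathbf{S}^{-1} \mathbf{S}^* \mathbf{S}^{-1} + \mathbf{S}^{-1} \mathbf{S}^* \mathbf{S}^{-1} \tilde{\mathbf{S}} \mathbf{S}^{-1} \right) \\
& + h \big( \phi^{(0,1)}(x,x^+) - \phi^{(0,1)}(x,x^-) \big)\, \mathbf{S}^{-1} \mathbf{A} \mathbf{S}^{-1} \\
& + h \big( \phi^{(0,1)}(x,x^+) + \phi^{(0,1)}(x,x^-) \big)\, \mathbf{S}^{-1} \mathbf{B} \mathbf{S}^{-1}.
\end{aligned} \tag{17}
$$

Note that the $o(h)$ above stands for a matrix whose coefficients are negligible
compared to $h$ as $h \to 0$.

Expressing $\phi^{(0,1)}(x,x^{\pm})$ in terms of $\rho^{(0,1)}(x,x^{\pm})$, we get

$$
\phi^{(0,1)}(x,x^{\pm}) = f(x) f'(x) \rho(x,x) + f^2(x) \rho^{(0,1)}(x,x^{\pm}) \tag{18}
$$

and then

$$
\begin{aligned}
n\, \mathbf{H}\, \mathrm{Var}\big(\hat{\boldsymbol{\beta}}_N(x)\big)\, \mathbf{H} = {} & \rho(x,x)\, \mathbf{S}^{-1} \mathbf{S}^* \mathbf{S}^{-1} + o(h) \\
& - h \rho(x,x) \frac{f'(x)}{f(x)} \left( \mathbf{S}^{-1} \tilde{\mathbf{S}} \mathbf{S}^{-1} \mathbf{S}^* \mathbf{S}^{-1} + \mathbf{S}^{-1} \mathbf{S}^* \mathbf{S}^{-1} \tilde{\mathbf{S}} \mathbf{S}^{-1} \right) \\
& + h \big( \rho^{(0,1)}(x,x^+) - \rho^{(0,1)}(x,x^-) \big)\, \mathbf{S}^{-1} \mathbf{A} \mathbf{S}^{-1} \\
& + h \left( \frac{2 f'(x)}{f(x)} \rho(x,x) + \rho^{(0,1)}(x,x^+) + \rho^{(0,1)}(x,x^-) \right) \mathbf{S}^{-1} \mathbf{B} \mathbf{S}^{-1}.
\end{aligned} \tag{19}
$$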

The variance expression can further be simplified due to the fact that

$$
\mathbf{e}_\nu' \mathbf{S}^{-1} \tilde{\mathbf{S}} \mathbf{S}^{-1} \mathbf{S}^* \mathbf{S}^{-1} \mathbf{e}_\nu = 0,
\qquad
\mathbf{e}_\nu' \mathbf{S}^{-1} \mathbf{B} \mathbf{S}^{-1} \mathbf{e}_\nu = 0 \tag{20}
$$

for all $\nu = 0, \ldots, p$. To see this, we need to examine in detail the above
matrices. By the symmetry of $K$, $\mathbf{S} = (\mu_{k+l})$ has its $(k,l)$th entry equal to
zero if $k, l$ are of different parity. The same property can be established for
$\mathbf{S}^{-1}$ by standard cofactor arguments. For $\tilde{\mathbf{S}} = (\mu_{k+l+1})$, the $(k,l)$th entry
is zero if $k, l$ are of the same parity. For $\mathbf{S}^* = (\mu_k \mu_l)$, the sparsity is even
stronger: all the rows, columns, and subdiagonals of odd order (recall that
the indexing starts at 0) have their entries equal to zero. With some matrix
algebra, one can check that the matrices $\mathbf{S}^{-1} \tilde{\mathbf{S}} \mathbf{S}^{-1}$ and $\mathbf{S}^* \mathbf{S}^{-1}$ have the same
sparsity structures as $\tilde{\mathbf{S}}$ and $\mathbf{S}^*$, respectively. It is then easy to obtain the
first part of (20). The second part is derived along the same lines. It suffices
to notice, on the one hand, that $\mathbf{e}_\nu' \mathbf{S}^{-1} \mathbf{B} \mathbf{S}^{-1} \mathbf{e}_\nu$ can be written as the double
sum $\sum_{k,l} [\mathbf{S}^{-1} \mathbf{e}_\nu]_k [\mathbf{S}^{-1} \mathbf{e}_\nu]_l B_{kl}$ over the indexes $k, l$ having the same parity as
$\nu$. On the other hand, $B_{kl} = \tfrac{1}{2} (\mu_k \mu_{l+1} + \mu_{k+1} \mu_l) = 0$ for $k, l$ both even or both
odd ($\mu_k = \mu_l = 0$ if $k, l$ odd and $\mu_{k+1} = \mu_{l+1} = 0$ if $k, l$ even). Combining
these two facts yields the asymptotic result for the variance term.
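The parity argument behind (20) can be verified in exact arithmetic for a specific symmetric kernel. The sketch below assumes the Epanechnikov kernel and $p = 3$, builds the moment matrices from their stated entries, and inspects the diagonal entries of the two matrix products; the helper functions are written for the example only.

```python
from fractions import Fraction

def mu(r):
    """r-th moment of the Epanechnikov kernel K(u) = 3/4 (1 - u^2) on [-1, 1]."""
    return Fraction(3, (r + 1) * (r + 3)) if r % 2 == 0 else Fraction(0)

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def inverse(a):
    """Exact Gauss-Jordan inverse of a square matrix of Fractions."""
    n = len(a)
    m = [list(row) + [Fraction(int(i == j)) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        piv = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[piv] = m[piv], m[col]
        pivot = m[col][col]
        m[col] = [v / pivot for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]

p = 3  # assumed order of the local polynomial fit
idx = range(p + 1)
S = [[mu(k + l) for l in idx] for k in idx]
S_tilde = [[mu(k + l + 1) for l in idx] for k in idx]
S_star = [[mu(k) * mu(l) for l in idx] for k in idx]
B = [[(mu(k + 1) * mu(l) + mu(k) * mu(l + 1)) / 2 for l in idx] for k in idx]

S_inv = inverse(S)
first = matmul(matmul(matmul(matmul(S_inv, S_tilde), S_inv), S_star), S_inv)
second = matmul(matmul(S_inv, B), S_inv)

# e_nu' M e_nu is the nu-th diagonal entry of M.
print([first[v][v] for v in idx])
print([second[v][v] for v in idx])
```

Using `Fraction` rather than floating point makes the structural zeros exact, so the check does not depend on a numerical tolerance.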



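Finally, we deduce from (19) and (20) that

$$
\begin{aligned}
n\, \mathrm{Var}\big(\hat{m}_\nu(x)\big) &= n (\nu!)^2\, \mathbf{e}_\nu' \mathrm{Var}\big(\hat{\boldsymbol{\beta}}_N(x)\big) \mathbf{e}_\nu \\
&= (\nu!)^2 h^{-2\nu} \rho(x,x)\, \mathbf{e}_\nu' \mathbf{S}^{-1} \mathbf{S}^* \mathbf{S}^{-1} \mathbf{e}_\nu + o(h^{-2\nu+1}) \\
&\quad + (\nu!)^2 h^{-2\nu+1} \big( \rho^{(0,1)}(x,x^+) - \rho^{(0,1)}(x,x^-) \big)\, \mathbf{e}_\nu' \mathbf{S}^{-1} \mathbf{A} \mathbf{S}^{-1} \mathbf{e}_\nu,
\end{aligned}
$$

which completes the proof of Theorem 1. □

Proof of Lemma 6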

By the same arguments used to approximate the matrix $\mathbf{S}_N$ with integrals
in (11), one can use the regularity (A5) of the covariance function $\rho$ to show
that the elements of $\mathbf{S}_N^*$ satisfy

$$
\begin{aligned}
s^*_{kl,N} &= \frac{1}{(Nh)^2} \sum_{i=1}^{N} \sum_{j=1}^{N} (x_i - x)^k (x_j - x)^l K\!\left(\frac{x_i - x}{h}\right) K\!\left(\frac{x_j - x}{h}\right) \rho(x_i, x_j) \\
&= h^{-2} \iint_{[0,1]^2} (u - x)^k (v - x)^l K\!\left(\frac{u - x}{h}\right) K\!\left(\frac{v - x}{h}\right) \rho(u,v) f(u) f(v)\,du\,dv \\
&= h^{k+l} \iint_{[-1,1]^2} u^k v^l \phi(x + hu, x + hv) K(u) K(v)\,du\,dv + o(h^{k+l+1}),
\end{aligned} \tag{21}
$$

assuming that $(Nh)^{-1} = o(h)$, i.e. $Nh^2 \to \infty$.

Using Taylor expansions together with (A4)-(A5), one can show that

$$
\phi(x + hu, x + hv) = \phi(x,x) + hu\, \phi^{(0,1)}(x,x^-) + hv\, \phi^{(0,1)}(x,x^+) + o(h)
$$

for all $0 \le u \le v \le 1$. This expansion is obtained by introducing a pivotal
point $(x + hu, x)$ or $(x, x + hv)$ such that the lines connecting this point to
$(x + hu, x + hv)$ and $(x, x)$ do not cross the main diagonal of $[0,1]^2$. One can
then safely perform Taylor expansions along the connecting lines, knowing
that $\phi$ is differentiable on each side of the diagonal. The above expansion
also relies on the identities $\phi^{(1,0)}(x^+, x) = \phi^{(0,1)}(x, x^+) = \phi^{(0,1)}(x^-, x)$ and
$\phi^{(1,0)}(x^-, x) = \phi^{(0,1)}(x, x^-) = \phi^{(0,1)}(x^+, x)$ (thanks to the symmetry of $\phi$
and the continuity of the first partial derivatives of $\phi$ on either side of the
diagonal). By symmetry considerations, it then holds for all $u, v \in [0,1]$ that

$$
\phi(x + hu, x + hv) = \phi(x,x) + h (u \wedge v)\, \phi^{(0,1)}(x,x^-) + h (u \vee v)\, \phi^{(0,1)}(x,x^+) + o(h).
$$

Using the fact that $(u \wedge v) + (u \vee v) = u + v$ and $(u \vee v) - (u \wedge v) = |u - v|$,
and writing $\phi^{(0,1)}(x,x^{\pm}) = \tfrac{1}{2} \big( \phi^{(0,1)}(x,x^+) + \phi^{(0,1)}(x,x^-) \big) \pm \tfrac{1}{2} \big( \phi^{(0,1)}(x,x^+) - \phi^{(0,1)}(x,x^-) \big)$,
one concludes, by the dominated convergence theorem, that

$$
\begin{aligned}
s^*_{kl,N} &= h^{k+l} \iint_{[-1,1]^2} u^k v^l K(u) K(v) \phi(x + hu, x + hv)\,du\,dv + o(h^{k+l+1}) \\
&= h^{k+l} \Big\{ \phi(x,x) \mu_k \mu_l + \frac{h}{2} \big( \phi^{(0,1)}(x,x^+) + \phi^{(0,1)}(x,x^-) \big) (\mu_{k+1} \mu_l + \mu_k \mu_{l+1}) \\
&\qquad + \frac{h}{2} \big( \phi^{(0,1)}(x,x^+) - \phi^{(0,1)}(x,x^-) \big) \iint_{[-1,1]^2} |u - v|\, u^k v^l K(u) K(v)\,du\,dv + o(h) \Big\}.
\end{aligned}
$$

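Proof of Theorem 2

This result is obtained along the lines of the proof of Theorem 1. More
precisely, it suffices to push the matrix expansions of $\mathbf{S}_N^{-1}$ in (15) and $\mathbf{S}_N^*$
in Lemma 6 to a higher order $d$. First, since $f \equiv 1$, it is easily seen that
$\mathbf{S}_N = \{1 + o(h^d)\} \mathbf{H} \mathbf{S} \mathbf{H}$ provided that $N h^{d+1} \to \infty$. Therefore, (15) simply
extends to $\mathbf{S}_N^{-1} = \{1 + o(h^d)\} \mathbf{H}^{-1} \mathbf{S}^{-1} \mathbf{H}^{-1}$. Second, if the covariance $\rho$ is
$d$ times differentiable at $(x,x)$, then a Taylor expansion of order $d$ can be
performed for $\rho(x + hu, x + hv)$, followed by an application of the dominated
convergence theorem over $[-1,1]^2$ as $h \to 0$. For $d = 4$, we get for instance
(see the proof of Lemma 6):

$$
\begin{aligned}
s^*_{kl,N} &= h^{k+l} \iint_{[-1,1]^2} u^k v^l K(u) K(v) \rho(x + hu, x + hv)\,du\,dv + o(h^{k+l+4}) \\
&= h^{k+l} \Big\{ \rho(x,x) \mu_k \mu_l + h \rho^{(0,1)}(x,x) (\mu_{k+1} \mu_l + \mu_k \mu_{l+1}) \\
&\quad + h^2 \Big( \rho^{(0,2)}(x,x) \frac{\mu_{k+2} \mu_l + \mu_k \mu_{l+2}}{2} + \rho^{(1,1)}(x,x) \mu_{k+1} \mu_{l+1} \Big) \\
&\quad + h^3 \Big( \rho^{(0,3)}(x,x) \frac{\mu_{k+3} \mu_l + \mu_k \mu_{l+3}}{3!} + \rho^{(1,2)}(x,x) \frac{\mu_{k+2} \mu_{l+1} + \mu_{k+1} \mu_{l+2}}{2!} \Big) \\
&\quad + h^4 \Big( \rho^{(0,4)}(x,x) \frac{\mu_{k+4} \mu_l + \mu_k \mu_{l+4}}{4!} + \rho^{(1,3)}(x,x) \frac{\mu_{k+3} \mu_{l+1} + \mu_{k+1} \mu_{l+3}}{3!} \\
&\qquad\quad + \rho^{(2,2)}(x,x) \frac{\mu_{k+2} \mu_{l+2}}{(2!)^2} \Big) + o(h^4) \Big\}.
\end{aligned} \tag{23}
$$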

The arguments used in Theorem 1 relative to the sparsity structure of
$\mathbf{S}^{-1}$ and the limit matrix of $\mathbf{S}_N^*$ still apply here. In a nutshell, the matrices of
the form $(\mu_{k+a} \mu_{l+b})$ in (23) that do contribute to the limit variance of $\hat{m}_\nu(x)$
are those for which both $\nu + a$ and $\nu + b$ are even. (This corresponds to the
nonzero moments of the kernel $K$.) Therefore, the terms of order $h$ and $h^3$
inside the brackets of (23) do not contribute to the limit variance of $\hat{m}_\nu(x)$.
For $\nu$ even, the terms $\mu_{k+1} \mu_{l+1}$ in (23) do not contribute either but the terms
$\mu_k \mu_l$ and $h^2 (\mu_{k+2} \mu_l + \mu_k \mu_{l+2})$ do. An expansion to order $d = 2$ is thus
sufficient. For $\nu$ odd, only the terms $h^2 \mu_{k+1} \mu_{l+1}$ and $h^4 (\mu_{k+3} \mu_{l+1} + \mu_{k+1} \mu_{l+3})$
contribute to the limit variance of $\hat{m}_\nu(x)$ up to order 4. In this case the
expansion to order $d = 4$ is necessary, as an expansion to order 2 only results
in a variance term of order $1/n$ (independent of $h$) when $\nu = 1$. Theorem 2
immediately follows from these arguments. □
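The parity rule stated in this proof is easy to enumerate. The following sketch lists, for a derivative order $\nu$ and an expansion order $d$, the index pairs $(a, b)$ with $a + b \le d$ for which both $\nu + a$ and $\nu + b$ are even, that is, the matrices $(\mu_{k+a} \mu_{l+b})$ that the rule keeps; each pair comes with the power $h^{a+b}$. The function name is hypothetical.

```python
def contributing_terms(nu, d):
    """Pairs (a, b) with a + b <= d such that nu + a and nu + b are both even."""
    return [(a, b)
            for a in range(d + 1)
            for b in range(d + 1 - a)
            if (nu + a) % 2 == 0 and (nu + b) % 2 == 0]

for nu, d in [(0, 2), (1, 2), (1, 4)]:
    pairs = contributing_terms(nu, d)
    # Each pair (a, b) enters the expansion with the factor h^(a + b).
    print(nu, d, [(a, b, a + b) for a, b in pairs])
```

For $\nu = 1$ and $d = 2$ the only surviving pair is $(1, 1)$, of order $h^2$, which cancels the factor $h^{-2\nu}$ and gives a variance term free of $h$; the pairs $(1, 3)$ and $(3, 1)$ only appear once the expansion is pushed to $d = 4$.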

Proof of Lemma \^ 



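Starting from the Taylor expansion (22) and the subsequent argument in
the proof of Lemma 6, it can be shown that

$$
\begin{aligned}
\rho(x + hu, x + hv) = {} & \rho(x,x) + \frac{h}{2} \big( \rho^{(0,1)}(x,x^+) + \rho^{(0,1)}(x,x^-) \big) (u + v) \\
& + \frac{h}{2} \big( \rho^{(0,1)}(x,x^+) - \rho^{(0,1)}(x,x^-) \big) |u - v| + o(h)
\end{aligned} \tag{24}
$$

for all $(u,v) \in [-1,1]^2$ as $h \to 0$.

Let us write $a = \tfrac{1}{2} \big( \rho^{(0,1)}(x,x^+) + \rho^{(0,1)}(x,x^-) \big)$ and $b = \tfrac{1}{2} \big( \rho^{(0,1)}(x,x^+) - \rho^{(0,1)}(x,x^-) \big)$ for
brevity. The dominated convergence theorem and (A5) imply that for any
bounded, measurable function $g$ on $[-1,1]$,

$$
\begin{aligned}
\iint_{[-1,1]^2} \rho(x + hu, x + hv) g(u) g(v)\,du\,dv
= {} & \rho(x,x) \left( \int_{-1}^{1} g(u)\,du \right)^2 + 2ah \int_{-1}^{1} g(u)\,du \int_{-1}^{1} v g(v)\,dv \\
& + bh \iint_{[-1,1]^2} g(u) g(v) |u - v|\,du\,dv + o(h).
\end{aligned} \tag{25}
$$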

The left-hand side of (25) is non-negative since the covariance $\rho$ is a
non-negative definite function. By taking $g = \mathrm{Id}_{[-1,1]}$ (that is, $g(u) = u$), we have
$\int_{-1}^{1} g(u)\,du = 0$, so that the remaining term $bh \iint_{[-1,1]^2} g(u) g(v) |u - v|\,du\,dv$ in the
right-hand side of (25) is also non-negative. Since $\iint_{[-1,1]^2} uv |u - v|\,du\,dv = -\tfrac{8}{15} < 0$, this
means that $b \le 0$ and hence $\alpha(x) = \rho^{(0,1)}(x,x^-) - \rho^{(0,1)}(x,x^+) \ge 0$. □
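The sign of the double integral used at the end of this proof can be checked numerically. The sketch below approximates the integral of $uv|u - v|$ over $[-1,1]^2$ with a midpoint rule; elementary integration over the region $u > v$ gives the exact value $-8/15$, and the grid size is an arbitrary choice.

```python
def double_integral_uv_abs(n=400):
    """Midpoint-rule approximation of the integral of u * v * |u - v| over [-1, 1]^2."""
    step = 2.0 / n
    mids = [-1.0 + (i + 0.5) * step for i in range(n)]
    total = 0.0
    for u in mids:
        for v in mids:
            total += u * v * abs(u - v)
    return total * step * step

value = double_integral_uv_abs()
print(value, -8.0 / 15.0)  # numerical approximation next to the exact value
```

The integral is negative because $|u - v|$ is largest where $u$ and $v$ have opposite signs, which is exactly where the product $uv$ is negative.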
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Figure 1: Left panel: regression functions mi (solid line) and m2 (dashed line). Right 
panel: first derivatives $m_1'$ (solid) and $m_2'$ (dashed).

  n     N    h_ex   h_as   h_cv   L2_ex                 L2_as                 L2_cv
...   ...    ...    ...    ...    ...                   ...                   ...   (0.002-0.013)
 50   100    0.03   0.04   0.03   0.006 (0.002-0.012)   0.006 (0.002-0.012)   0.006 (0.002-0.012)
100    10    0.06   0.03   0.06   0.008 (0.005-0.011)   0.016 (0.013-0.020)   0.008 (0.005-0.011)
100    50    0.03   0.03   0.03   0.003 (0.001-0.006)   0.003 (0.001-0.006)   0.003 (0.001-0.006)
100   100    0.03   0.03   0.03   0.003 (0.001-0.006)   0.003 (0.001-0.006)   0.003 (0.001-0.006)



Table 1: Local linear estimation of $m_1$ with Wiener process noise.

  n     N    h_ex   h_as   h_cv   L2_ex                 L2_as                 L2_cv
 10    10    0.17   0.15   0.18   0.050 (0.031-0.078)   0.050 (0.031-0.079)   0.051 (0.032-0.078)
 10    50    0.14   0.15   0.15   0.035 (0.021-0.057)   0.035 (0.021-0.056)   0.043 (0.025-0.068)
 10   100    0.14   0.15   0.16   0.033 (0.021-0.054)   0.033 (0.021-0.053)   0.041 (0.026-0.065)
 50    10    0.09   0.09   0.08   0.016 (0.010-0.025)   0.016 (0.010-0.025)   0.016 (0.011-0.025)
 50    50    0.08   0.09   0.08   0.009 (0.006-0.014)   0.009 (0.006-0.014)   0.010 (0.007-0.015)
 50   100    0.08   0.09   0.08   0.009 (0.006-0.014)   0.009 (0.006-0.014)   0.010 (0.007-0.015)
100    10    0.07   0.07   0.08   0.011 (0.007-0.016)   0.011 (0.007-0.016)   0.011 (0.007-0.016)
100    50    0.06   0.07   0.06   0.005 (0.003-0.007)   0.005 (0.003-0.007)   0.005 (0.004-0.008)
100   100    0.06   0.07   0.06   0.005 (0.003-0.007)   0.005 (0.003-0.007)   0.005 (0.004-0.008)

Table 2: Local linear estimation of $m_1$ with Ornstein-Uhlenbeck process noise.



  n     N    h_ex   h_as   h_cv   L2_ex                 L2_as                 L2_cv
 10    10    ∞      0.28   ∞      0.026 (0.017-0.044)   0.034 (0.020-0.059)   0.029 (0.018-0.050)
 10    50    ∞      0.28   ∞      0.024 (0.015-0.039)   0.025 (0.016-0.040)   0.027 (0.016-0.044)
 10   100    ∞      0.28   ∞      0.025 (0.015-0.041)   0.026 (0.016-0.043)   0.029 (0.017-0.048)
 50    10    0.16   0.16   0.20   0.010 (0.006-0.016)   0.010 (0.006-0.016)   0.012 (0.007-0.242)
 50    50    0.12   0.16   0.14   0.008 (0.005-0.012)   0.008 (0.005-0.012)   0.010 (0.006-0.018)
 50   100    0.12   0.16   0.14   0.008 (0.005-0.012)   0.008 (0.005-0.012)   0.010 (0.006-0.018)
100    10    0.12   0.13   0.14   0.006 (0.004-0.009)   0.006 (0.004-0.009)   0.006 (0.004-0.009)
100    50    0.09   0.13   0.10   0.004 (0.003-0.006)   0.005 (0.003-0.007)   0.005 (0.003-0.007)
100   100    0.09   0.13   0.10   0.004 (0.003-0.006)   0.004 (0.003-0.007)   0.005 (0.003-0.007)



Table 3: Local linear estimation of $m_2$ with Ornstein-Uhlenbeck process noise.

  n     N    h_ex   h_as   h_cv   L2_ex                 L2_as                 L2_cv
 10    10    0.07   0.04   0.08   1.97 (1.52-2.61)      3.97 (3.72-4.30)      2.05 (1.59-2.72)
 10    50    0.05   0.04   0.07   0.84 (0.62-1.13)      0.88 (0.64-1.15)      1.07 (0.77-1.50)
 10   100    0.05   0.04   0.07   0.84 (0.60-1.13)      0.90 (0.64-1.15)      1.13 (0.79-1.60)
 50    10    0.06   0.03   0.06   1.63 (1.42-1.86)      4.18 (4.14-4.24)      1.65 (1.43-1.87)
 50    50    0.03   0.03   0.03   0.29 (0.22-0.37)      0.28 (0.22-0.37)      0.30 (0.23-0.39)
 50   100    0.03   0.03   0.03   0.26 (0.20-0.33)      0.26 (0.20-0.33)      0.28 (0.21-0.36)
100    10    0.06   0.03   0.06   1.60 (1.45-1.77)      3.73 (3.69-3.76)      1.61 (1.47-1.78)
100    50    0.02   0.02   0.03   0.18 (0.14-0.23)      0.18 (0.14-0.22)      0.18 (0.14-0.22)
100   100    0.02   0.02   0.03   0.17 (0.13-0.20)      0.17 (0.13-0.20)      0.17 (0.13-0.21)

Table 4: Local linear estimation of $m_1'$ with Wiener process noise.



  n     N    h_ex   h_as   h_cv   L2_ex                 L2_as                 L2_cv
 10    10    0.11   0.04   0.08   0.96 (0.59-1.71)      23.0 (11.1-50.1)      3.41 (1.39-6.89)
 10    50    0.09   0.04   0.07   0.52 (0.33-0.76)      0.92 (0.67-1.21)      0.64 (0.42-0.94)
 10   100    0.09   0.04   0.07   0.52 (0.33-0.78)      0.89 (0.64-1.17)      0.63 (0.42-0.91)
 50    10    0.09   0.03   0.06   0.52 (0.30-0.81)      1940 (536-4894)       8.27 (7.76-8.90)
 50    50    0.06   0.03   0.03   0.15 (0.10-0.21)      0.26 (0.19-0.35)      0.23 (0.17-0.32)
 50   100    0.06   0.03   0.03   0.14 (0.09-0.19)      0.23 (0.18-0.29)      0.20 (0.15-0.26)
100    10    0.09   0.03   0.06   0.48 (0.32-0.69)      23.8 (15.7-33.0)      8.20 (7.82-8.64)
100    50    0.05   0.02   0.03   0.08 (0.06-0.12)      0.15 (0.12-0.20)      0.15 (0.12-0.20)
100   100    0.05   0.02   0.03   0.08 (0.06-0.11)      0.13 (0.10-0.17)      0.13 (0.10-0.16)



Table 5: Local quadratic estimation of $m_1'$ with Wiener process noise.

[Figure 2 image: three panels, each with horizontal axis ticks at 0.05, 0.15, 0.25, 0.35]

Figure 2: Comparison of local linear (solid line) and local quadratic fitting (dashed line)
for the estimation of derivatives. The estimation target is $m_2'$ and the covariance function
is $\rho_2$, with $n = N = 50$ in (1).